Finite Automata and Efficient Lexicon Implementation
Authors
Abstract
We describe a general technique for the encoding of lexical functions (such as lexical classification, gender and number marking, inflections and conjugations) using minimized acyclic finite-state automata. This technique has been used to store a Portuguese lexicon with over 2 million entries in about 1 megabyte. Unlike general file compression schemes, this representation allows random access to the stored data. Moreover, it allows the lexical functions and their inverses to be computed at negligible cost. The technique can be easily adapted to practically any language or lexical classification scheme, and this task does not require any knowledge of the programs or data structures.
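As an illustration of the idea, here is a minimal Python sketch of a minimized acyclic automaton that stores lexical entries. The abstract does not give the encoding used by the authors, so everything below is an assumption: each entry is stored as the string `word+tags`, the `+` separator, the tag notation (`N:m:s` for noun, masculine, singular) and the sample words are hypothetical, and minimization is done by building a trie and then merging equivalent states bottom-up, which may differ from the construction used in the paper.

```python
from typing import Dict, Iterable, List, Tuple


class Node:
    """One automaton state: outgoing labelled edges plus a final flag."""
    __slots__ = ("edges", "final")

    def __init__(self) -> None:
        self.edges: Dict[str, "Node"] = {}
        self.final = False


def build_trie(entries: Iterable[str]) -> Node:
    """Build an (unminimized) acyclic automaton accepting exactly the entries."""
    root = Node()
    for entry in entries:
        node = root
        for ch in entry:
            node = node.edges.setdefault(ch, Node())
        node.final = True
    return root


def count_states(root: Node) -> int:
    """Count the distinct states reachable from root."""
    seen, stack = set(), [root]
    while stack:
        node = stack.pop()
        if id(node) not in seen:
            seen.add(id(node))
            stack.extend(node.edges.values())
    return len(seen)


def minimize(root: Node) -> Node:
    """Merge equivalent states bottom-up (modifies the automaton in place)."""
    registry: Dict[Tuple, Node] = {}

    def canon(node: Node) -> Node:
        for ch, child in list(node.edges.items()):
            node.edges[ch] = canon(child)
        # Two states are equivalent if they agree on finality and on
        # their (label, canonical target) pairs.
        key = (node.final,
               tuple(sorted((ch, id(c)) for ch, c in node.edges.items())))
        return registry.setdefault(key, node)

    return canon(root)


def analyses(root: Node, word: str, sep: str = "+") -> List[str]:
    """Lexical function: return every tag string stored for `word`."""
    node = root
    for ch in word + sep:
        node = node.edges.get(ch)
        if node is None:
            return []
    found: List[str] = []

    def walk(n: Node, prefix: str) -> None:
        if n.final:
            found.append(prefix)
        for ch in sorted(n.edges):
            walk(n.edges[ch], prefix + ch)

    walk(node, "")
    return found


# Hypothetical sample entries in the assumed "word+tags" encoding.
ENTRIES = ["gato+N:m:s", "gata+N:f:s", "gatos+N:m:p", "gatas+N:f:p",
           "pato+N:m:s", "pata+N:f:s", "patos+N:m:p", "patas+N:f:p"]

trie = build_trie(ENTRIES)
states_before = count_states(trie)
lexicon = minimize(trie)
states_after = count_states(lexicon)
print(states_before, states_after)   # the second number is smaller
print(analyses(lexicon, "gatas"))    # ['N:f:p']
```

Because words that share prefixes and suffixes share states, the minimized automaton is smaller than the trie, while a lookup still costs only one transition per character, which is the kind of random access that general file compression does not offer.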
Similar resources
Reduction of Computational Complexity in Finite State Automata Explosion of Networked System Diagnosis (RESEARCH NOTE)
This research puts forward rough finite state automata represented by two variants of BDD, called ROBDD and ZBDD. The proposed structures have been used in networked system diagnosis and can overcome combinatorial explosion. The implementation uses the CUDD (Colorado University Decision Diagrams) package. A mathematical proof of the claimed complexity is provided, which shows ZBDD ...
Between Finite State and Prolog: Constraint-based Automata and Efficient Recognition of Phrases
This paper describes a new type of automaton that has been developed at CIS Munich for efficient recognition of phrases in text files. The concept of a constraint-based automaton is tailored to sets of phrases where grammaticality depends on morphological agreement conditions. It incorporates features from three sides: traditional finite state techniques, methods from constraint programming, and some...
Between finite state and Prolog: constraint-based automata for efficient recognition of phrases
This note describes a new type of automaton that has been developed at CIS Munich for efficient recognition of phrases in large German corpora. The concept of a constraint-based automaton is tailored to sets of phrases where grammaticality depends on morphological agreement conditions. It incorporates features from three sides: traditional finite state techniques, methods from constraint programmin...
Syntactic Analysis by Local Grammars Automata: an Efficient Algorithm
Describing the constraints that restrict word combinations in specific contexts yields grammars that help reduce the number of ambiguities in lemmatized texts. These grammars make it easy to eliminate many of the ambiguities without even using complex general syntactic rules involving a lexicon-grammar. Local grammars can be represented in a very natural way by finite state automa...
Incremental Construction of Compact Acyclic NFAs
This paper presents and analyzes an incremental algorithm for the construction of Acyclic Nondeterministic Finite-state Automata (NFA). Automata of this type are quite useful in computational linguistics, especially for storing lexicons. The proposed algorithm produces compact NFAs, i.e. NFAs that do not contain equivalent states. Unlike Deterministic Finite-state Automata (DFA), this property ...